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ABSTRACT 

We use numerical simulations to study the effects of the patchiness of a partly reionized inter- 
galactic medium (IGM) on the observability of Lya emitters (LAEs) at high redshifts (z > 6). 
We present a new model that divides the Lya radiative transfer into a (circum-)galactic and 
an extragalactic (IGM) part, and investigate how the choice of intrinsic line model affects 
the IGM transmission results. We use our model to study the impact of neutral hydrogen on 
statistical observables such as the Lya restframe equivalent width (REW) distribution, the 
LAE luminosity function and the two-point correlation function. We find that if the observed 
changes in LAE luminosity functions and equivalent width distributions between z ^ Q and 
z ~ 7 are to be explained by an increased IGM neutral fraction alone, we require an extremely 
late and rapid reionization scenario, where the Universe was ~ 40 % ionized at z = 7, ~ 50 % 
ionized at z = 6.5 and ~ 100 % ionized at z = 6. This is in conflict with other observations, 
suggesting that intrinsic LAE evolution at z > 6 cannot be completely neglected. We show 
how the two-point correlation function can provide more robust constraints once future ob- 
servations obtain larger LAE samples, and provide predictions for the sample sizes needed to 
tell different reionization scenarios apart. 

Key words: galaxies:high-redshift — galaxies:statistics — radiative transfer — methods: nu- 
merical 



1 INTRODUCTION 

The epoch of reionization (EoR) is currently one of the major fron- 
tiers in observational cosmology. It marks the last big phase transi- 
tion of the Universe, when the intergalactic medium (IGM) went 
from neutral to ionized. It is generally believed (e.g. Robertson 
|et al.|2010) that the first stars and galaxies played a dominant role in 
ionizing the IGM, but the exact nature and timing of this important 
event in the history of the Universe have so far remained elusive. 

Current observations of the EoR are all indirect, and do not 
provide very strong constraints. Spectra from high-redshift quasars 
show absorption bluewards of the Lya wavelength — so-called 
Gunn-Peterson troughs ( |Gunn & P eterson 19651 — implying that 
the Universe did not completely reionize until 2 ~ 6 I F an et aL| 
|2006| >. Measurements of the Thompson scattering optical depth 
from the WMAP satellite imply a reionization redshift of z re w 10 
( Komatsu et al. 2009 ), but are consistent also with various extended 
reionization scenarios. Measurements of the IGM temperature sug- 
gest that reionization finished no earlier than z ~ 10 and no later 
than z ~ 6.5 ( |Theuns et al.|2002[ [Hui & Haiman(2003] |Raskutti| 
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|et al.|2012) . In the future, direct constraints on the EoR are expected 
to come from observations of the redshifted 21 cm line emitted by 
the (partly) neutral IGM. Radio telescopes such as LOFAR (LOw 
Frequency ARra^, PAPER (Precision Array to Probe Epoch of 
Reionization^}, 21CMA (21 cm ArrajQ and MWA (Murchison 
Widefield Arraj|^} are just beginning observations that will mea- 
sure statistical properties of the 21 cm radiation, and the planned 
SKA (Square Kilometer Arra}^} will be able to image the process 
directly. 

Meanwhile, galaxies with Lya emission have started attract- 
ing attention as an additional indirect probe of the later stages of 
the EoR. The resonant nature of the Lya line makes it sensitive to 
even small amounts of neutral hydrogen, and we thus expect the 
observed properties of Lya emitters (LAEs) to change at higher 
redshifts, when the IGM was on average more neutral. 

In the last few years, there have been many efforts to ob- 
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tain large high-z samples of LAEs (e.g. Malhotra & Rhoads 2004 
|Ouchi et al.1[20T0l |Hu et al.|[20Tol |Kashikawa et al.||2011> ,"us- 
ing narrow-band photometric observations and in some cases with 
spectroscopic confirmations. We are now starting to get sizable 
samples at redshifts where reionization might still be ongoing. So 
far however, attempts to obtain constraints on the global IGM neu- 
tral fraction from these samples have produced somewhat different 
and contradictory results. 

There are a number of ways one could potentially study the 
EoR using LAEs. Some studies (Dayal & Ferrara 2011, Dijkstra 
|et al.|20PTj l try to constrain the ionization state of the IGM based 
on individual detections of very high-z LAEs (Lehne rTet al.|2010| >. 
The reasoning is that the mere observation of a single LAE would 
set an upper limit on the amount of neutral hydrogen present around 
the galaxy — were it too high, the galaxy would not have been 
observable. 

A more common approach, and the one used in this paper, is 
to focus instead on the effects that an increasingly neutral IGM at 
higher redshifts will have on the statistical properties of the pop- 
ulation of LAEs. Perhaps the most obvious effect is that the in- 
creased H I density at higher redshifts is expected to lead to a de- 
crease in the fraction iL yo of UV-selected galaxies that also show 
Lya emission ( |Kashikawa et al. 2006 ). While the observed LAE 
fraction increases with redshift up to z ~ 6 — likely due largely 
to the decreasing dust content in earlier galaxies (e.g. Sta rk et aE] 
|20T0l >— there are some tantalising observational indications that 
iLya might start to decrease for even higher z (|Ota et al.|[2008| 
IStark et al.|20Tol|Pentericci et al.|201 l||Ono et al'|2012||Schenker| 
|et al.||20"T2| l. |Hayes et al.| |20TTJ see a corresponding sharp drop 
in the global Lya escape fraction. However, the uncertainties are 
large, and not everyone sees this effect (e.g. Tilv i et al.|2010[ >. 

An observable related to the LAE fraction is the LAE lumi- 
nosity function (LF). Observations show very little evolution in the 
LAE LF up to z ~ 6 JMalhotra & Rhoads|2004")|Ouchi et al.|2008| 
|Tilvi et ak] 2010l. If Lya radiation becomes attenuated by an in- 
creasingly neutral IGM at higher redshifts, we expect the LF to start 
dropping in amplitude. There are indeed some observations that in- 
dicate this ( |Ouchi et al.|2010||Kashikawa et al.|2011) , although the 
data remains limited, especially at the faint and bright ends of the 
LF. If there is a connection between the Lya luminosity of a galaxy 
and the transmitted fraction of Lya photons, this could also change 
the shape of the LF. 

A third potentially useful observable is the two-point corre- 
lation function of LAEs, which measures the clustering of these 
galaxies in the sky. The patchiness of the reionization process is 
expected to give rise to an increase in the clustering of galaxies 
l |McQuinn et al.|2007| [Tliev et al.|2008a||Tilvi et al.|2009|[OucnTl 
|et al.|2010) . Since the Lya transmission depends on the ionization 
state of the IGM in the vicinity of a galaxy, observations will favour 
galaxies located within large H II regions. Thus, the two-point cor- 
relation function will show a stronger clustering for a sample of 
LAEs than for a UV-selected sample with the same number den- 
sity. 

Accurate numerical modelling of observed LAE populations 
at very high redshifts is a computationally challenging task. It in- 
volves simulating the growth of structure in the early Universe, the 
patchy reionization of the IGM (where many essential parameters 
are still poorly constrained), star formation in the first galaxies, 
and finally the complicated radiative transfer (RT) of Lya photons. 
Most previous studies use simplified semi-analytical models of the 
IGM ( |Dayal et al.||2008l |Tilvi et al.|p009] |Dijkstra et aLl[20TT), 
or very simple models of the Lya emitters themselves {McQ uinn 



|et al.|2007||Iliev et al.|2008a) . Others use detailed models of both 
the IGM and the LAEs (Zheng et al. 2010 1, which comes at the cost 
of a higher computational complexity, resulting in a smaller part of 
parameter space being explored. 

In this paper, we use numerical models of the IGM combined 
with a simple recipe for Lya line shapes — based on detailed hy- 
drodynamical simulations of galaxies — to obtain predictions for 
the statistical observables described in the introduction. Like Zheng 
|et al.lpOlO) , we only consider one specific reionization scenario, 
but by simplifying the radiative transfer of Lya through the IGM 
(see Sec. [23} we are able to cover a large range of IGM ionized 
fractions. This study is similar in nature to |Iliev et al.| ( |2008"a) , but 
uses an updated version of the reionization code, a bigger simulated 
box and a novel method for separating the complicated Lya radia- 
tive transfer in the close vicinity of galaxies from the more straight- 
forward calculations in the low-density IGM. Whereas Ilie v et ak] 
(2008a) simply assumed a Gaussian Lya line shape for all emitters, 
we use a recipe for double-peaked profiles with little emission at the 
line centre, motivated by spectra from detailed radiative transfer in 
the vicinity of galaxies taken from cosmological Smooth Particle 
Hydrodynamics (SPH) simulations. We compare the results from 
this line profile to the simple Gaussian model, and discuss some 
implications of various assumptions regarding the choice of line 
shape model. 

The outline of the paper is as follows: in Section|2]we describe 
our simulations and the assumptions that went into the modelling. 
In Section [5] we present the results from the simulations. We dis- 
cuss the line shape model used and the implications of using dif- 
ferent line models, and the effect of the IGM on the observed line 
shapes. We show our simulated luminosity functions, equivalent 
width distributions and correlation functions and compare these to 
observations where available. Finally, in Section]?] we summarise 
and discuss our results. 

The cosmological parameters used in the simulations 
were for a flat ACDM model with (f2 m , fib, h, n, erg) = 
(0.27, 0.044, 0.7, 0.96, 0.8), consistent with the 5 year WMAP re- 
sults JKomatsu et al.|2009> . 



2 SIMULATIONS 

Our simulations were carried out in four separate steps. First, a 
cosmological N-body simulation was performed to obtain a time 
evolving density field and dark matter halo lists. The density field 
was then used to simulate the transfer of ionizing radiation and the 
growth of the H II regions surrounding the haloes. Separately from 
this, a number of galaxies were simulated in detail with a hydro- 
dynamical code, modelling the complicated Lya radiative transfer 
in the close vicinity of the galaxies. We used the spectra derived 
from the hydrodynamical simulations to get a picture of what the 
Lya line looks like at the edge of the galaxies, and how it varies in 
shape with halo mass in order to devise a simplified line model. Fi- 
nally, after assigning Lya spectra to the dark matter haloes from our 
N-body simulations, we calculated the optical depth, r(A), through 
the IGM by tracing a large number of sightlines from each galaxy 
through our simulation box, taking into account the variations in 
density, ionized fraction and gas peculiar velocities. 

2.1 N-body simulations 

The N-body structure formation simulations were performed with 
the massively-parallel, hybrid (MPI+OpenMP) code CubeP 3 M 
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Figure 1. Output from our IGM simulations at a global ionized fraction (x) m = 0.67. The figures show a 1-cell thick (= 0.64 cMpc) slice of the IGM. 
All figures are 163 cMpc across. Top left: Ionized fraction (white is ionized, black is neutral). Top right: H I +H II density. Bottom left: Halo positions for 
Mh > 10 10 Mq. Bottom right: Neutral hydrogen density. 



pliev et al.||2008b| |Harnois-Deraps et al.||2012b , developed from 
the PMFAST code {Merz et al.||2005| >~CUBEP 3 M is a particle- 
particle-particle-mesh code which calculates gravitational force in- 
teractions by combining long-range forces calculated on a mesh 
with short-range direct forces between particles. For efficiency rea- 
sons the long-range forces are calculated on two grids: a fine grid 
covering the local domain, and a global coarse grid. 



2.2 Radiative transfer of ionizing radiation 

For the simulation of the radiative transfer of ionizing radiation 
from the galaxies identified in the previous step, we use d the code 
C 2 -RAY — Conservative, Causal Ray-tracing method I Mellema 
et al. 2006b. In C 2 -RAY, each source — taken from the N-body 
runs described above — is given a flux proportional to its mass, 
Mh, so that the photoionization rate from each source at a distance 



A spherical over-density method was used to identify dark 
matter haloes. This method starts by identifying local maxima in 
the fine grid density distribution, and proceeds by enclosing these 
in successively larger spheres until the average density inside the 
sphere is below 178 times the global mean density. The N-body 
code was run on a cube with a side of 163 comoving Mpc (here- 
after cMpc) with a fine grid of 6144 3 points using 3072 3 particles, 
and finally regridded onto a grid with 512 3 cells. This box size 
is large enough that cosmic variance effects are small (Barkana & 
|Loeb|20 04 ). The smallest resolved haloes consisted of 20 particles, 
which gives a minimum halo mass of 10 8 Mq , corresponding to the 
mass at which atomic cooling becomes important (see |Iliev et al.| 
|20l"2] for details). 



where a v is the ionization cross-section for hydrogen, t v is the 
optical depth, m p is the proton mass and v t h is the threshold en- 
ergy for ionization of hydrogen. The factor g 7 is a model parame- 
ter which determines the halo mass - ionizing flux relation. More 
precisely, it is the number of ionizing photons escaping the galaxy 
per halo baryon per 10 Myr, which depends on assumptions about 
the initial mass function, star formation efficiency and escape frac- 
tion of the galaxies. We assume that the baryonic density follows 
the dark matter density, and we do not consider contributions to the 
ionization by other sources, such as QSOs. 
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The time evolution for the ionized fraction of hydrogen, Xi, is: 



dxi 
~dt 



= (1 - Xi)(F + n e C H ) - Xin e ceH, 



(2) 



where n e is the electron density, Ch is the collisional ionization co- 
efficient and an is the recombination coefficient. The on-the-spot 
approximation is used, i.e. all recombinations to the ground state 
are assumed to be locally absorbed, so that an = &b- Eq. {2} is 
solved by C 2 -RAY through an iterative process, which is necessary 
since Y can have contributions from many sources. This iteration 
is carried out for all sources in a photon-conserving manner, over a 
(163 cMpc) 3 box, discretised into a grid with 256 3 cells, using the 
density field from the N-body simulations as input. In the end, we 
obtain density, gas peculiar velocity, ionized fraction and ionizing 
flux for each cell (see Fig. [I}. For a detailed description of C 2 -RAY 
along with several comparisons against exact analytical solutions 
and other radiative transfer codes, see Mellema et al. (2006) and 
|Iliev et al.||2006) . 

In this study, we consider a model where g-, = 1.7 and where 
small sources (M^ < 10 9 A/q) become suppressed as soon as the 
IGM in their vicinity is ^ 10% ionized. This effect is known as 
self-regulation and is motivated by the fact that smaller sources lack 
the gravitational well needed to keep accreting material in an ion- 
ized and heated environment, and so will essentially stop forming 
stars as their neighbourhood becomes ionized piiev et aI7||2007| >. 
The g 7 = 1.7 model gives a fairly late reionization, consistent 
with observations of quasar spectra. The evolution of the global 
ionized fraction for hydrogen is illustrated in Fig. [2] which shows 
the global ionized fraction averaged by mass and by volume. The 
mass- weighted ionized fraction, (a;} m is higher than the volume- 
weighted fraction, indicating that the high-mass regions with many 
galaxies are the ones to become ionized first. This is known as 
inside-out reionization. 

Due to this finite resolution, small scale density fluctuations 
in the IGM are smeared out. Since recombinations scale with 
nmin e , the effects of these density fluctuations are often included 
by boosting the recombination term with a clumping factor, C = 
(n 2 )/(n) 2 . The value of this term depends on baryonic physics 
and will be different between cold and photo-heated gas. Including 
a maximal clumping factor based on dark matter clumping down 
to scales ~ 10 5 Mq is found to extend reionization by Az ~ 1 
(Mao et al., in prep.). Lyman limit systems are the denser struc- 
tures which contain enough H I to achieve an optical depth > 1. 
They will limit the distance ionizing radiation can travel. Unfortu- 
nately, the evolution of this mean free path during the EoR is not 
known, but their presence is likely to delay the end of reionization 
by making it more difficult for ionized regions larger than the mean 
free path to merge with others. 

The simulation used here does not include the effects of gas 
clumping and Lyman limit systems and therefore likely overesti- 
mates the redshift of the end of reionization. On the other hand 
there is considerable uncertainty in the luminosity of ionizing ra- 
diation escaping from galaxies and more efficient sources would 
reionize the Universe earlier. Given these uncertainties, we will re- 
fer to our simulated IGM boxes by mass-weighted ionized frac- 
tion, (a;} m , rather than redshift. Effectively, we are assuming that 
the curve in Fig.|2]could in reality be translated slightly along the 
x-axis, i.e. reionization could be taking place somewhat earlier or 
later than in our simulations. We believe that this is reasonable if the 
actual topology of reionization is not substantially different from 
our model ( |Friedrich et al.|201 l[[Tliev et al.|20l2] >. 




Figure 2. Evolution of the ionization state of the IGM during the late stages 
of our simulations. The dotted line shows the mean ionized fraction by vol- 
ume, and the dashed line shows the mean ionized fraction by mass. 



2.3 Lya radiative transfer through the IGM 

While the general problem of Lya radiative transfer is complex 
and computationally demanding (e.g. Zhen g et al.|2010[ >, it can be 
greatly simplified in the low-density IGM far away from the galaxy 
where the radiation was emitted. In the high-density region close 
to the galaxy, photons are scattered in and out of the line of sight, 
undergoing frequency changes on the way. Simulating this process 
requires very high resolution and many assumptions regarding, for 
instance, dust content and star formation. However, as was shown 
in |Laursen et al.| ( [2011| >, after a distance of ~ 1.5 times the virial 
radius of the galaxy, very few photons are scattered into the line 
of sight, and therefore it is justified to divide the radiative transfer 
into two regimes: a galaxy part where photons diffuse out of the 
optically thick gas around the halo (discussed in Sec. |2.4) , and an 
IGM part where we consider only scatterings out of the line of sight 
(effectively absorption). We set the boundary between these two 
regimes to be at 1.5 r- v j r . This division is schematically outlined in 
Fig.|3] 

The (circum-)galactic part will be discussed in the next sec- 
tion. For the extragalactic part, we use a modified version of the 
code IGMTRANSFER (Laursen et al.|2011) . IGMTRANSFER cal- 
culates the Lya transmission function -F(A) for a given wavelength 
interval, i.e. the transmitted fraction of radiation as a function of 
wavelength. Note that this is independent of the shape of the emis- 
sion line. 

The transmission function is given by: 



F(X) 



r(\) 



(3) 



where the optical depth, r, is calculated by stepping through the 
box along a line of sight, while summing the r from each cell: 



r (X) = ^2 nm,iO- Lya (X + Xv^^/cjAr 



(4) 



Here, Ar is the length of each step, 7im,i is the neutral hydrogen 
density, <TL ya is the neutral hydrogen cross section to Lya scat- 
tering and viii is the gas velocity component of the cell along the 
line of sight, including the Hubble flow. The sightline is traced until 
photons at 1190 A — well outside the Lya line profiles considered 
here — have redshifted into resonance. The IGM temperature was 
assumed to be 10 4 K everywhere. 

In the original version of IGMTRANSFER, photons are emit- 
ted in random directions and bounce randomly as they hit the edge 
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2.4 Lya line model 




Figure 3. The two different radiative transfer regimes in our model. In the 
dense medium inside and close to the halo, the line — initially with a Voigt 
profile — can become broadened and shifted and typically obtains an ab- 
sorption feature at the Lya line centre. In this optically thick regime, pho- 
tons scatter both into and out of the line of sight. As the radiation enters the 
less dense IGM, scatterings into the line of sight become extremely rare, and 
it becomes justified to consider only scatterings out of the sightline. Figure 
adapted from Laursen 1 2010 1. Note that the colours are only for telling dif- 
ferent photons apart; they are not related to photon wavelengths. 



of the box. Since our simulated box uses periodic boundary con- 
ditions, we modified the code so that sightlines travel through the 
periodic box instead of bouncing. We also added the possibility 
to more finely sample the IGM box. While the original version 
of IGMTRANSFER is limited to sampling each cell only once, we 
now divide each cell into many smaller subdivisions (Ar in Eq. 
[4)1, which gives more reliable results for F(A^1 Furthermore, we 
added the possibility to specify sightline directions, rather than hav- 
ing photons emitted in random directions. 

A cosmological simulation unavoidably involves a trade-off 
between resolution and volume. Bolton & Becker (2009) studied 
the impact of resolution and box size on various observables in 
cosmological hydro-simulations at 2 < z < 5 and found that a gas 
particle mass resolution of M ~ 10 5 ~ 6 M Q is needed for the sim- 
ulations to converge. However, at the high redshifts that we have 
been studying, the neutral fraction in the lowest-density regions, 
i.e. far from the galaxies, first of all is still sufficiently high that vir- 
tually no radiation is transmitted, and second corresponds to wave- 
lengths in the transmitted spectra very far from the Lya line centre. 
Conversely, the regions that do affect the line are so close to the 
galaxies that they typically are much better resolved in the under- 
lying N-body simulations. 

In general, a lower resolution will smear out the underlying 
density field and give a higher optical depth, as seen in Bolton & 
|Becker| ( |2009| >. However, a higher resolution may also start to re- 
solve very dense, self-shielded objects that give rise to damping 
wings and actually lower the transmission, as shown recently by 
Bolton & Haehnelt (2012). Both of these studies use a spatially 
uniform ultraviolet background, and it remains to be seen exactly 
how the results will be affected by a more realistic, fluctuating ion- 
izing flux. 



6 While we still assume nm to be the same across the cell, the Hubble flow 
velocity can vary considerably across a cell for coarser resolutions. If the 
cell is only sampled once, this will result in an unphysical shape of r(A). 



From the previous steps we have snapshots of the IGM and dark 
matter density, gas peculiar velocities, halo lists and a method for 
calculating the transmission of Lya in the low-density IGM regime. 
The only missing piece of the puzzle is the emitted spectra J om (A), 
i.e. the Lya spectra as they look as they emerge from the galaxies, 
here taken to be at a distance 1.5 r v i r from the galaxy centre, as 
discussed earlier. These are to be multiplied by F(X) to get the 
observed spectra. 

Some previous studies ( [Dijkstra et al.| 2007 ; McQuin n et al.| 
2007; |Ilievetal.|2008a) simply assume a Gaussian line shape, and 
apply Eq. ^ directly to this. While this may give a rough picture of 
the effects of the IGM, it ignores the fact that Eq. ^ is only valid 
in the very low density IGM. For the (circum-)galactic part (within 
1.5 r v ir), full radiative transfer is required, along with a detailed 
model of the structure of the galaxies. 

We will use the term intrinsic line to mean the Lya line at the 
border between the ISM and the IGM. For our fiducial line model 
(introduced below), this is taken to be 1.5 r v - 1T . However, when we 
compare to the Gaussian line results, we will start the sightlines at 
the halo centres, essentially treating the halo as part of the IGM. 
This is done to facilitate fair comparisons to earlier studies, such as 
|Iliev et al.|p008al >. 

In the case of a spherically symmetric, homogeneous H I dis- 
tribution, the problem of Lya radiative transfer can be solved ana- 
lytically (Dijkstra et al. 2006a). The resulting line shape emerging 
from the galaxy has the form: 



Jem («£ ) 



24ar \1 + cosh( ^/2tt 3 /27j x 3 /ar ) 



(5) 



where x is the frequency shifted to the Lya line centre and divided 
by the Doppler width of the line, a is the Voigt parameter and to 
is the optical depth to the surface of the galaxy. This line is plotted 
in Fig.|4]for some typical galaxy parameters (dotted red line in the 
bottom part of the figure). 

Observed Lya lines will deviate significantly from Eq. {5} 
since real galaxies are neither homogeneous nor spherical but disk- 
like and clumpy and may have complicated outflows and inflows. 
To investigate how these factors complicate the spectra, we isolated 
a smaller, high-resolution, sample of ~ 2000 galaxies from a sepa- 
rate cosmological SPH simulation at four different redshift epochs 
(z — [6.0, 7.0, 7.8, 8.8]) and ran detailed Lya RT calculations on 
these. 

The cosmological simulation is similar to the ones outlined in 
|Laurse n et al. ( 201 1 ), which themselves are a significant improve- 
ment on the ones described in |Sommer-Larsen et al.| ( [2003| > and 
|Sommer^La rsen (2006). The reader is referred to these papers for 
a detailed description of the code; in short, it is a TreeSPH code 
using self-consistent, ab initio hydro/gravity simulations to calcu- 
late the structure in a spherical region of linear dimension 10/i 
cMpc. The mass of the SPH and DM particles were 7.0 x 10 5 
Mq and 3.9 x 10 6 Mq, respectively, and the minimum smoothing 
length ~ 100 pc. A simplified ionizing UV RT scheme is invoked, 
modelling the meta-galactic UV background (UVB) after Haardt & 
|Madau| ( [19961 . 

The galaxies span several orders of magnitude in stellar mass; 
from 10 8 to 10 10 Mq. Before the Lya RT, the physical parameters 
of the gas particles (neutral hydrogen density, temperature, bulk ve- 
locity, dust density, and Lya emissivity from both cooling radiation 
and recombinations following ionization; at these high redshifts, 
the two processes contribute roughly equally to the total Lya lu- 
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Figure 4. Some examples of simulated intrinsic Lya line profiles (top), and 
the various simplified models discussed in the text (bottom). All line profiles 
are normalised with arbitrary units on the y-axis. 

minosity) are first interpolated onto a mesh of adaptively refined 
cells, using their original kernel function. The resulting galaxies 
then each consist of 10 2 ~ 3,5 cells. However, although the physical 
size of the smallest cells is ~ 10 pc, typically the best resolution 
is in fact some 100 pc. Finally, the full RT is conducted using the 
Monte Carlo code MoCaLaTA ( |Laursen et al. 2009a]bJ, calcu- 
lating the spectra emerging at a distance 1.5 virial radii from the 
centre of the galaxies by tracing individual photon packets as they 
scatter out through the ISM. 

Simulating Lya production and radiative transfer in very high- 
z galaxies involves many poorly constrained parameters such as 
star formation rates, dust content, ISM ionization structure and gas 
dumpiness. While the simulations in our high-resolution galaxy 
sample give a fairly detailed picture of the Lya line, the resolu- 
tion is still too low to fully trust all aspects of the results as-is. For 
instance, we find that the values of the Lya luminosities are some- 
what too low to match observations. However, we assume that the 
general shapes of the emerging Lya lines and their dependence on 
halo mass are reasonable. 

A small number of the simulated lines are plotted in the top 
panel of Fig. [4] We find that the lines in general have a double- 
peaked structure, with close to zero emission at the line centre, 
in agreement with Eq. (J3}. However, the two peaks are generally 
asymmetric, falling off more slowly away from the line centre. 
We also find that the lines tend to become broader for higher-mass 
galaxies. This general line shape is familiar from both observations 
and other simulations (e.g. |Tapken et al.||2007{ Verhamm e et al.| 
|2008|[Yamadaet al.|2012} . The presence of neutral hydrogen in the 
ISM makes it difficult for Lya photons to escape before they have 
scattered and become red- or blueshifted away from the line centre. 

While not all observed lines show the double-peaked line pro- 
file, they typically have most of the emission at wavelengths away 
from the line centre. This is in contrast to the Gaussian model, 
where the peak of the emission is exactly at the line centre. As 
we shall see in Sec. [5] this difference has a significant effect on the 
Lya transmission through the IGM. 

To model the line shapes from our high-resolution galaxy sam- 
ple, we adopt a Gaussian-minus-a-Gaussian (GmG) line shape with 
a width that depends on the halo mass: 

J om (A) OC e -(^of/2,j _ e -(X-X f/2„i (f)) 

By fitting such lines to all the spectra in our high-resolution galaxy 
sample, we find that o\ and 02 both increase with halo mass, 



roughly as: 

(Ji = -6.5 + 0.75 log M h /M e (7) 
cr 2 = -3.2 + 0.35 log M h /M e , (8) 

that is, the line shapes tend to become broader for more massive 
galaxies. 

While somewhat ad-hoc, the GmG model appears to repro- 
duce the simulated line shapes rather well. As we will show in Sec. 
[3] the important distinction when it comes to IGM transmission is 
whether most of the emission is at the line centre (as in the Gaus- 
sian model) or offset from the line centre (as in the GmG model), 
and not in the exact shape of the peaks. Since low-z observations, 
analytical models and simulations all tend to show line shapes with 
either double peaks or a single peak on the blue side of the line 
centre (e.g. | Verhamme et alj2008l|Dijkstra et al.|2006a|b[|Wilman| 
|et al.|200"5] >, we adopt Eqs. |6]), l|7j and ([8j as our fiducial model. 

Observed Lya line profiles — in contrast to our model spectra 
— often have one peak that is stronger than the other. Gas that is 
falling into the galaxy may cause the red peak to become weaker 
than the blue one, and outflows will have the opposite effect (e.g. 
Dijkstra et al. 201 1 1. Such asymmetries can have a large effect on 
the absolute value of the fraction of Lya photons that are trans- 
mitted through the IGM. However, in this paper we calibrate our 
intrinsic luminosities against observations (see Sec. |3.2.1[ (, which 
means that the absolute values of the transmitted fractions have lit- 
tle or no effect on our results. What we are interested in is mainly 
the dependence of the transmission on quantities such as the lo- 
cal H I density and halo mass. These dependencies will be largely 
unaffected by asymmetries in the peaks. 

In our analyses, we generally make the assumption that the 
intrinsic properties of LAEs (i.e. luminosities and Lya line shapes) 
do not change during the time interval we are studying. This is a 
strong assumption, and one that is widely debated. Kashikawa et al. 
< |20 1 1 1 > argue that it is probably reasonable at least up to z = 6.5 
given that observations of the UV luminosity function — which is 
insensitive to the IGM — stays the same between z = 5.7 and z = 
6.5 within the error bars. Others, such as |Ouchi et al.| ( f2~010| >, do 
observe evolution in the UV LF, implying that the changes in LAE 
populations are due at least in part to intrinsic source evolution. 
In Sec. [4] we discuss the implications of the no-intrinsic-evolution 
assumption on our results. 

3 RESULTS 

3.1 General effects of the IGM 

Using the methods described above, we assign spectra according to 
Eqs. j6), l|7) and l[8j to all haloes in our box with M h > 10 10 M o . 
We then trace 50 sightlines per halo in different directions through 
the IGM box and calculate the IGM transmission function for each 
one. 

Even a very small amount of neutral hydrogen is enough to ab- 
sorb virtually all radiation bluewards of Lya. For a highly ionized 
IGM, with only some very small amount of residual H I, the trans- 
mission F(X) resembles a step function. At higher neutral fractions 
(and for certain individual sightlines in the more ionized cases) the 
transmissions begin showing damping wings that absorb some flux 
on the red side of the line centre, as illustrated in the top panel of 
Fig.0 

The middle panel of Fig.|5]shows the evolution of the probabil- 
ity density function (PDF) of the transmitted fraction, T a , defined 
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Figure 5. Top: Median IGM transmission functions for varying ionized 
fractions. Middle: probability density functions of the transmitted fraction 
of Lya photons for all our simulated sightlines, i.e. 50 sightlines each for all 
haloes with Mh > 10 10 Mq. The solid lines are for the GmG line profile 
and the dotted lines are for the Gaussian profile. The colour coding is the 
same as for the top figure. Bottom: Examples of observed line shapes for a 
1O 1O M0 halo. The grey line shows the intrinsic line. 



J Jem(\)F(\)d\ 

J J cm (A)dA ' 



(9) 



The PDFs were calculated over all the simulated sightlines, us- 
ing the GmG line shape model (for comparison, we also show the 
results from the Gaussian line model). Note that T a depends only 
on the shape of the intrinsic line, and not on the value of the Lya lu- 
minosity (we will discuss luminosities later on). Furthermore, since 
the blue peak is usually completely absorbed, only the red peak has 
an effect on the variation in T a with damping wing strength. The 
blue peak only affects the absolute value of T a . 

In a more neutral IGM, the strength of the damping wing is 
the most important factor for determining T a . For low ionized frac- 
tions, the damping wings are strong enough to absorb almost all the 
emitted photons, and the PDF peaks at only a few percent transmis- 
sion. For intermediate {x) m the large variation in damping wing 
strength among different sightlines becomes apparent as the PDF 
becomes very broad. 

As we approach (a;) m = 1, the distribution of T a clusters very 
strongly around 50%. This effect is made stronger by the double- 
peaked nature of the line profile. At this point, any damping wings 
will be weak, and mostly affect the transmission close to the line 
centre. But since most of the emission is in the wings, a couple of 
A away from the centre, the transmitted line will consist of most 
of the red wing and nothing more. 

A single-peaked line profile centred on the Lya line centre is 
more sensitive to damping wing strength than a line profile with 
most of the emission offset from the line centre, such as our GmG 
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Figure 6. Median T a taken over all sightlines for all haloes as a function 
of IGM mean ionized fraction, for the Gaussian and GmG line models. The 
shaded areas show the 1 cr spread among individual sightlines. 

model. This results in wider T a PDFs, as seen in the middle panel 
of Fig. [5] It also affects the average value of T a as a function of 
(x) m , as we show in Fig. [6] The GmG model has a somewhat flatter 
relation between (T a ) and (a;) m , especially at the later stages of the 
EoR. In other words, using a simplified model such as the single- 
peaked Gaussian may result in an over-prediction of the IGM ab- 
sorption. 

3.1.1 T a dependence on halo mass 

While the global mean value of T a at a given redshift is the most 
important factor in determining the change in LAE luminosity 
function and the decrease in the LAE fraction, the relationship be- 
tween halo mass and T a can affect a number of second-order ef- 
fects, such as the shape of the LF and the IGM induced boost of the 
correlation function. This relationship is complex and depends on 
a number of factors. 

The strength of the damping wing depends on the density of 
neutral hydrogen in the close vicinity of the source, and so the 
transmitted fraction of Lya will be a tracer of the size of the H II 
region around the source. Since massive haloes tend to produce 
more ionizing photons, one would expect their surrounding H II 
regions to be larger, leading to higher values for T a . Furthermore, 
massive haloes predominantly form in highly biased regions with 
many smaller-mass neighbours. These all contribute to enlarge the 
H II region and, again, lead to larger T a . On the other hand, more 
massive haloes will tend to have more infalling gas, which tends to 
quench the blue portion of the Lya spectrum and give a lower IGM 
transmission. Also, the biased regions with the most massive haloes 
will typically have a higher gas density, which — if not highly ion- 
ized — will lower the transmission. 

Earlier studies have approached this problem in different 
ways, and come to different conclusions about the halo mass - T a 
relation, [iliev et a l. (2008a) use a simple Gaussian model and find 
that infalling gas gives a lower transmitted fraction for more mas- 
sive haloes. Mc Quinn et al.| (2007 1 perform similar simulations, but 
find almost no trend at all in two of their models, and an oppo- 
site trend in their model with the largest H II regions. Zhe ng et al.| 
l |2010[ l perform full radiative transfer calculations, and find a weak 
trend with lower T a for higher-mass haloes, which they attribute to 
the higher density in the regions around these objects. 

In Fig. [TJwe show the relation between (T a ) and halo mass 
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Figure 7. Mean IGM transmission for the various line models discussed in the text, binned according to halo mass, for four different IGM ionized fractions. 
The error bars show the 1 a spread of the (T a ) distributions for each mass bin. 



for four different line models. The two single Gaussian mod- 
els refer to the models used in Iliev et al. (2008a) — one with 
a constant line width of 150 km/s and one with a width of 
150 km/s x [M h /(10 10 M Q )] 1/3 . The analytical solution refers 
to the line shape given in Eq. l|5j. In addition, we show the results 
of using the single-width Gaussian line, but starting the sightlines 
1.5 virial radii outside the galaxies. 

For our GmG model, there is a clear trend of higher (T a ) for 
higher halo mass. This seems to contradict the results of |Iliev et al.| 
( 2008a ), who found a negative correlation. This discrepancy is most 
likely due to the fact that they did not divide the Lya radiative trans- 
fer into two regimes, as we discussed in Sec. |2.3| Since the amount 
of infall is largest close to the galaxies, this can not be studied with 
the exp(— r) model, but needs full radiative transfer. In our model, 
where we start the IGM radiative transfer at 1.5 r v - lr , we implicitly 
compensate for most of the infall effects when we fit the mass-to- 
light ratio in Sec. |3.2. f| 

The green and yellow lines of Fig. [7] show the results from 
using the Gaussian model, starting sightlines at the halo centres. 
Here, we do indeed see the same trends as |Iliev et al.| ( |2008"a) , with 
lower transmission for high-mass haloes at high (a;) m , especially 
for the single-width line model. However, if we start the Gaussian 
lines at the edge of the galaxies (grey dash-dotted lines), the infall 
trends disappear completely, indicating that the regions very close 
to the galaxies have a large effect on the transmission. Since these 
regions are not properly resolved in our large-scale simulation, one 
should be careful to draw any firm conclusions on the effects of in- 
fall based on these results. In our high-resolution galaxy sample, we 
saw no strong effects from infall, but higher resolution and better 
knowledge of input parameters are likely needed to properly study 
this. Also, different reionization models can give different results, 
as seen in McQ uinn et al.| (2007 ). The reionization scenario used 
in |Iliev et al. ([2008a I did not include self-regulation effects, and 
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Figure 8. Correlation between mean T a for a given halo and the num- 
ber of close neighbour haloes (d < 5 cMpc). Only haloes with 10 < 
log M-^/Mq < 10.1 were selected. The solid line shows the mean and 
the dotted lines show the 1 a spread for different bins. 



so one should be careful when comparing their results to the ones 
presented here. 

The main reason for the positive correlation seen when split- 
ting the radiative transfer into two parts is that higher-mass haloes 
are usually located in larger H II regions. This is due partly to the 
fact that large galaxies emit more ionizing radiation than smaller 
ones, and thus ionize the gas around them to a larger extent. How- 
ever, it is likely more dependent on the fact that high-mass haloes 
tend to be located in biased regions with many neighbouring haloes 
surrounding them. These neighbours all help create large H II re- 
gions, as was also discussed in |Iliev et aT] < |2008a| >. In Fig. [8] we 
show the correlation between (T a ) and the number of close neigh- 
bour haloes (here defined as haloes within 5 cMpc, a distance which 
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typically puts neighbours within the same H II region) for haloes 
with logMk/M® « 10. Since all these haloes have the same 
mass, they all ionize the IGM to the same extent, and have similar 
amounts of infall. Still there is a fairly strong correlation between 
number of neighbours and {T a ). 

The second thing to note in Fig. [7] is that even if we fix the 
starting point of the sightlines to 1.5 r v i r , the choice of line model 
has an appreciable effect on the results. As we discussed above, 
lines with a lot of emission at the line centre are more sensitive 
to damping wing strength, while the double-peaked models (GmG, 
analytical) tend to give somewhat flatter relations between {T a and 
halo mass. 

In summary, the exact nature of the halo mass-T a relation is 
complicated, and will depend on the definition of T a and assump- 
tions about the Lya line shape. Here, we are interested in the effects 
of the IGM, and define T a as the transmission through the IGM af- 
ter the photons leave the galaxies. For this definition, we see a fairly 
weak trend with higher transmission for higher-mass haloes. In our 
model, any infall effects that are taking place close to the galaxies 
will be calibrated away when we fit the luminosities later on. 
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Figure 9. Simulated Lya LFs compared to observations at z = 5.7 by 
OulO. The dotted green line shows the best fit assuming a constant mass- 
to-light ratio. The blue line is the best fit assuming a log-normal scatter in 
the mass-to-light. 



3.2 LAE luminosity functions 

Observing the evolution of the LAE luminosity function (LF) with 
redshift can provide information about both the change in intrinsic 
properties of the galaxies and the changing state of the surrounding 
IGM. Unfortunately, it is far from straightforward to disentangle 
these effects. For example, a decrease in the LF could come from 
a decrease in star formation, a higher dust content, a denser and/or 
more neutral IGM or a combination of all three. 

The LAE LF appears to change very little, if at all, up to 
around z = 6 jMalhotra & Rhoads|2004l |Ouchi et al.|2008|[TTivTl 
|et al.| |2010). In other words, if the reionization of the Universe 
is still ongoing at these redshifts, the decrease in observability of 
LAEs due to the neutral IGM must be counteracted by an intrinsic 
change of the galaxy properties, such as a lower dust content. 

However, at higher redshifts there is an observed change in 
the LF. |Ouchi et al.| ( f2010| l (hereafter OulO) observe a sample of 
207 LAEs at z = 6.5 in the Subaru/XMM-Newton Deep Survey 
(SXDS) field, and find that the LF at z = 6.5 is different from 
the observations at z — 5.7 with a fairly high degree of certainty. 
By comparing to simulations by [McQuinn et al. (2007), |liiev et al.| 
(2008a), [Dijkstrae t al. (2007} and |Furlanetto et al.H2006| they corT 
elude that this suggests a change in neutral fraction of the IGM of 
A{x) m « 0.2. This value assumes a model for the intrinsic evolu- 
tion of LAEs based on observed changes in the UV LF. 

|Kashikawa et al.| ( [2006) and |Kashikawa et al.| ( f2011) (here- 
after Kail) also observe the LAE LF at z = 6.5 in the Subaru 
Deep Field (which is separate from the SXDS field of OulO) and 
see a difference compared to z — 5.7. Comparing their results to 
models both by McQ uinn et al.| ( [2"007 1 — who do not include any 
intrinsic galaxy evolution — and |Kobayashi et aL] ( |2010} — who do 
model intrinsic evolution — they require A(a;) m ~ 0.4 atz — 6.5. 
This assumption is motivated by the fact that they, in contrast to 
OulO, do not see any evolution in the UV LF during the same time 
interval. |Dayal et al.| (J2008 1, are able to reproduce the LF evolu- 
tion between 2 = 6 and z = 6.5 with only galaxy mass evolution, 
but this then requires a very sharp drop in Lya escape fraction to 
explain the absence of change in the LF at z < 6. 



3.2.1 Intrinsic luminosity model 

We now investigate for which ionized fraction our models best re- 
produce the observed LFs by OulO and Kail. For this we need a 
model for the Lya luminosity of our simulated haloes. |Iliev et aT] 
( 2008a ) and McQuinn et al. (2007) assume a constant ratio between 
dark matter halo mass and Lya luminosity, and fit this to the obser- 
vations. In reality, however, we expect there to be some variation in 
luminosity among haloes of the same mass, which is also what we 
see in the detailed simulations of our high-resolution galaxy sam- 
ple. Not only is there not a direct one-to-one relation between halo 
mass and production of Lya photons, but the escape fraction can 
vary a lot between different sightlines (Laursen & Sommer-Larsen] 
|2007[ >. While a constant mass-to-light ratio will give a LAE LF 
with a shape very similar to the mass function (the difference com- 
ing only from the dependence of (T a ) on halo mass), adding some 
random scatter to this relation will tend to flatten the LF and give a 
higher bright end and a lower faint end. 

We attempt to find a good model for the Lya luminosity by 
fitting our results to the observed LF at z = 5.7 from OulO. Our 
IGM simulation does not go down to such low redshifts, so we 
take our lowest-redshift halo sample, from z = 6.0, and make the 
assumption that (T Q ) = 0.5 for all haloes at this redshift. This 
is motivated by Fig. [5] where it is shown that the T a distribution 
becomes very narrow and centred on 0.5 for low neutral fractions 
(as is shown in[Songaila (2004), there is still enough residual H I 
to remove the blue part of the spectrum completely). 

In Fig.[9]we show our best fit to the observations, assuming a 
model where the Lya luminosities of haloes of a given mass follow 
a log-normal probability distribution with a mean that is propor- 
tional to the halo mass, and a fixed width. When fitting this re- 
lation to the observations, we fixed the scatter to a = 0.4 (see 
Eq.|10}. This value and the log-normal distribution were motivated 
by the simulations of the high-resolution galaxy sample discussed 
in Sec. |2.4| and are similar to the model used by Munoz & Loeb 
< |20 1 1| >- The constant mass-to-light model is roughly consistent with 
the data, but appears to give slightly too steep a bright end. The 
log-normal-scatter model, however, matches the observations very 
well. By x 2 minimisation on a grid, we find that the best fit to the 
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observations is: 



3.2.3 Sample variance 



logL Q [erg/s] = 

= log(M h /M Q ) + norm(^ = 31.7, a = 0.4), 



(10) 



where norm(<r, /i) denotes a random variable distributed according 
to a normal distribution with mean value fi and width a. 

We would like to point out that the value of fi in Eq. (jToJ 
should not be interpreted too strictly. As we discussed earlier, there 
is a degeneracy between the absolute value of T a and the intrinsic 
luminosity. Had we used a line model with a stronger red peak, our 
T a values would be higher, and, consequently, the intrinsic lumi- 
nosities lower. In this study, we are only concerned with the depen- 
dence of T a on, for instance, IGM neutral fraction and halo mass, 
and so this degeneracy has little effect on our results. 



3.2.2 Comparison to observations at z — 6.5 

In Fig. [10] we show simulated LFs for different IGM ionization 
states, compared to observations at z ~ 6.5 by OulO and Kali. 
It is evident from this figure that the uncertainties in the observa- 
tions are too large to put any robust constraints on (x) m atz — 6.5. 
Nevertheless, we still investigate for which (x) m our simulations 
best fit the data. 

To do this, we make the simplifying assumption that the un- 
derlying intrinsic LAE LF does not change between z = 6 and 
z — 6.5, in accordance with the observations at lower redshifts 
discussed above. For this purpose, in producing Fig. [10] we have 
scaled the underlying halo masses in such a way as to give a con- 
stant LAE mass function at all ionization states. This ensures that 
all the evolution in the LF comes from the changing ionization state 
of the IGM. We then go on to calculate the \ 2 goodness-of-fit, com- 
paring the LFs in Fig. [l0]to the observations by OulO and Kali. 
The best fit is for an IGM ionized fraction of around (x) m ~ 0.5, 
but the errors in the data are still too large to get a robust constraint. 
However, it is interesting to note that both the OulO and Kal 1 sam- 
ples, which come from different fields, both give the same best fit 
value. 

It is worth pointing out here, that while we look for the best- 
fitting ionized fraction at z — 6.5, this value for {x) m occurs al- 
ready at z = 7.2 in our simulations (see Fig.|2j. This makes the 
comparison somewhat inconsistent, and implies that reionization 
in fact took place later than in our model. However, we believe the 
comparison is still reasonable, since shifting our IGM simulations 
in time would not significantly alter the results, as we argued above. 

The value of {x) m ~ 0.5 at z — 6.5 is in conflict with other 
measurements as we will discuss in Sec. [4] which casts some doubt 
on the assumption that the drop in LF is due only to an increasingly 
neutral IGM. If we drop this assumption and allow the mass of the 
DM haloes to evolve according to our N-body simulations while 
assuming that Eq. \10\ holds at all times (i.e. there is no change 
in dust content, star formation efficiency etc.), then we find that 
the drop in LF can be explained with an IGM that is completely 
reionized already at z > 6.5. This, however, would imply that the 
LF continues to grow in amplitude at z < 6, which is incompatible 
with observations. It would seem most likely that the LF drop is due 
to a combination of IGM and intrinsic evolution, but observations 
of LAEs alone can not break this degeneracy. 



In the figures below, we illustrate the effect of cosmic variance and 
statistical fluctuations when observing in a limited field. We do this 
by considering three different observational boxes: 

• The full box. This is our entire simulated 163 cMpc box, 
where we assume we have the full 3D positions of all the haloes. 

• A deep box. An optimistic approximation of what might be 
observed in a future space-based survey, e.g. with the JWST. Such 
a survey will have trouble achieving a large field-of-view, but is un- 
constrained by atmospheric effects and so can, given enough time, 
obtain spectroscopy over a large redshift range. We assume this box 
is 60 arcmin 2 , ~ 1/6 the side of the full box, with the depth set to 
the whole box depth. We also assume, optimistically, that we have 
spectroscopic data, i.e. 3D positions, of all the galaxies in the box. 

• A thin box. This approximates a large ground-based photo- 
metric survey with a large field-of-view but limited redshift range. 
We assume a field of ldeg 2 which is approximately the size of our 
full box, with Az — 0.1, i.e. 1/5 the depth of the full box. Here 
we assume that we only know the 2D positions of the haloes. This 
is roughly equivalent to the Subaru Deep Field (Kal 1). 

Fig.[TT|shows the field-to-field variation in LAE LFs. The fig- 
ure was produced by splitting our simulation box into many small 
sub-boxes, and calculating the LAE LF for each sub-box. It is clear 
that sample variance becomes a major issue in a small field such as 
our deep box. Here, the field-to-field variation is very large com- 
pared to the change in LF due to an increasing neutral fraction, as 
seen in Fig. [To] and in many cases the higher-luminosity bins end 
up empty, making the bright end very difficult to study. However, 
for the thin box, the field-to-field variation is of the same order as a 
~ 10% change in (a;) m , at least at the faint and intermediate parts 
of the LF. In Fig.[T2]we show the number of LAEs above a given 
luminosity in the full box and in each of the sub-boxes. 

3.3 LAE fraction 

Several observers report a sharp decrease in the fraction XL yQ of 
drop-out selected galaxies that show Lya emission between red- 
shifts 6 and 7 (Ota et al.|2008l |Stark et al.|2010| |Pentericci et~aT 



20TT]|Onoetal. 2012 Schenker et al.|2012) . For example, 



Penter- 



icci et al.| i 201 1 1 observe a sample of 20 drop-out selected galaxies, 
and find Lya emission in only 3, and |Schenker et aL] ( |2012^ find 
only 2 LAEs (plus one possible) in a sample of 19 drop-outs. While 
there are a number of effects that could explain this, arguably the 
most obvious one would be an increase in the IGM neutral fraction, 

as noted by e.g. |Forero-Romero et al.| ( [2012). 

Here, we follow the procedure of Dijkstra et al.||20TT) to es- 
timate the decrease in iLya for an increasing neutral fraction, us- 
ing the distribution of observed Lya rest frame equivalent widths 
(REW). We begin by assuming that the REWs at z — 6 are dis- 
tributed according to an exponential distribution P Z= 6(REW) oc 
cxp ^ rew^ ) an ^ setting REW C = 50 A to match the observa- 
tions at this redshift (Stark et al. 2010). Making again the simpli- 
fying assumption of no intrinsic LAE evolution, the distribution at 
some other redshift z can be written as: 



P Z (REW) = N / P intr (REW/T Q )P z (T Q )dT Q , 



(11) 



where P z (T a ) is the PDF of the Lya transmitted fraction, shown 
in Fig. [5] N is a normalisation constant and P n tr is the REW 
distribution for T a = 1 . 
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Figure 10. Simulated Lya LFs for five different IGM ionization fractions, compared to observations at z = 5.7 and z = 6.5. Poisson errors are shown for the 
first LF; they have been omitted from the others for clarity. All LFs were calculated assuming no intrinsic evolution in the LAE luminosities. 
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Figure 11. Left panel: Variation in LAE LFs from several thin box subfields taken from our big simulation box. The solid black line shows the mean value 
for the LF, the dotted lines show the lcr variation and the shaded area indicates the range between the minimum and maximum values of the LF for each 
luminosity bin. Right panel: Same as left, but for the deep sub box. 



|Dijkstra et al.| ( |2011[ ) assume, conservatively, that the IGM 
transmission is 100% at z = 6, i.e. that P Z = 6 (REW) = 
Pmtr(REW). Since observations show close to zero transmission 
on the blue side ( Songaila 2004 1, this assumption is only valid if the 
intrinsic lines are very heavily biased towards the red wing, which 
is true in the model of Dij kstra et al.| ( |20lT| l, but not for our GmG 
model. 

To keep consistency with our previous discussion in Sec. 
|3.2.1| we assume that the IGM is very close to fully ionized at 
z = 6, but still retains a small amount of residual H I that absorbs 
the blue wing of the Lya line, so that the T a PDF takes the form 
P z= e(T a ) = 5(T a — 0.5). This assumption is actually very similar 
to that of Dijkst ra~et al.| |20TTJ, since their emission is mostly at 



the red side of the line centre. Inserting this into Eq. {TTJ gives the 
intrinsic equivalent width distribution: 

P Z=6 (REW) = 

= N [ P intI {REW/T a )S(T a - 0.5)dT a = 
Jo 

= iVP intr (REW/0.5), (12) 

so that: 

P intr (REW) oc exp (^^) ■ d3) 
Using this expression for Pi ntr (REW) we can calculate the 
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Figure 12. Number of LAEs above a given luminosity limit in our full box and the two subboxes for different ionized fractions. As a reference point, we show 
in the rightmost panel the sample of jOuchi et al.|(2010) , who detect 207 LAEs with L a > 2.5 X 10 42 erg/s at z = 6.5, in a field roughly the same size as our 
thin box. 
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Figure 13. Fraction of LAEs above a given REW for different IGM ioniza- 
tion states, compared to observations at z = 7. The solid black line is for 
z = 6, assuming (T a ) = 0.5. The observations are: Pll =|Pentericci et al.| 
(20TT) ; S10 = |Starketal.|j20T0) ; Schl2 = |Schenker et al.H2012> . 
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Figure 14. Same as Fig. 1 1 3) but using the simple Gaussian model. 



REW distribution at higher redshifts using Eq. {TTJ. Fig,|13|shows 
the expected number of LAEs above a given REW along with data 
points from a number of observations. Pent ericci et al.|(20H il claim 
that they need a change in neutral fraction of Axhi ~ 0.6 be- 
tween z — 6 and z = 7 to explain the drop in xt,ya, and Fig. 
[T3] seems consistent in requiring such an extremely fast ionization 
of the IGM. Even at (x) m as low as 0.3, the IGM absorption is just 
barely enough to be consistent with the observations. However, just 
as the luminosity functions, interpretation of the REW distribution 
is strongly dependent on the (possible) intrinsic galaxy evolution. 

Another reason that we require such a low ionized fraction is 
our GmG line model. The fact that the Lytv photons mainly escape 
their host galaxies away from the line centre means that changes 
in the IGM damping wings have a smaller effect on T a than for 
a line profile with emission at the line centre, as previously dis- 
cussed. For reference, we show the REW cumulative distribution 
function (CDF) for the simple Gaussian model in Fig. [14] While 
not radically different from Fig. [T3] the Gaussian model does give 
a greater change in the CDF when adjusting the ionized fraction of 
the IGM. 

3.4 Correlation functions 

The clustering of Lya emitters has been proposed as a potential 
probe of the ionization state of the IGM (Mc Quinn et al.||2007| >. 
The reasoning is that on top of the intrinsic clustering of galaxies, 
the presence of large neutral patches of hydrogen which absorb the 
Lya emission in some areas will cause the LAEs to appear more 
clustered. 

This is illustrated in Fig. [15] which shows the positions of the 
most luminous haloes with and without the IGM radiative transfer 
applied. Since the ionized bubbles will be located around clusters 
of galaxies, the IGM acts as a kind of filter, obscuring haloes in 
non-biased regions, and showing only the most densely clustered 
regions. Thus, by comparing the clustering from a sample of LAEs 
to a sample of galaxies selected by some other method that is not 
sensitive to the IGM, it should be possible to say something about 
the ionized state of the IGM. This method has the advantage of 
not requiring any assumptions regarding the intrinsic evolution of 
LAEs, only that any such evolution is independent of environment. 

The quantitative effects of the IGM on the correlation func- 
tion was studied by McQuinn et al. ( 2007), who found that the ef- 
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Figure 15. Projected positions for the 200 brightest haloes in our sample 
without (left) and with (right) IGM effects included. At a global ionization 
fraction of {x) m = 0.7, there is barely any noticeable increase in cluster- 
ing. For (x) m = 0.3, however, the extra clustering starts to become visible. 



feet should be large enough to be measurable in future large LAE 
surveys, using a simple Gaussian line model. |Iliev et aL| j2008a| >, 
however, concluded that the difference between the IGM and non- 
IGM case is very small and difficult to detect. In the preparation of 
this paper, we discovered that due to a data handling error the cor- 
relation functions shown in Figs. 27 and 28 of |Iliev et al.| f2008a) 
incorrectly showed a much smaller difference than was actually the 



case. When plotting the correct data we find that the results in Iliev 



et al.| ( |2008a) were qualitatively consistent with those of McQuinn 
et al.| (|2007j) and those shown in this paper. 

|Kashikawa et al.| ([2006 1 calculate the two-point angular cor- 
relation function of 58 LAEs observed at z — 6.5 in the Subaru 
Deep Field, but find no evidence of even intrinsic galaxy cluster- 
ing, although they note that the small sample size and the relatively 
small surveyed area introduce large uncertainties. Also McQuinn 
|et al.| l |2007} study the same sample and find it to be consistent 
with a fully ionized IGM. |Ouchi eTal] pOTO) measure the corre- 
lation function of 207 LAEs at z = 6.5, but while they do detect 
a clustering in their sample, the signal is not strong enough to be 
attributed to the IGM. 

In this section we investigate the potential of future observa- 
tions to constrain the ionized fraction of the IGM using clustering 
measurements. We use the spatial two-point correlation function 
(2PCF) £(r), which is the excess probability of finding two galax- 
ies in two volume elements dVi and dV 2 separated by a distance 
r: 



dP = n 2 [l + £(r)]dVidV 2 



(14) 



where n is the number density of galaxies. We calculate £(r) using 
the method described in Martel (1991). The projected version of 
£(r) — the two-point angular correlation function, usually denoted 
w(8) — is defined as: 

dP = n[l + w{6)]dn 1 dQ 2 (15) 

where dfii and dQ 2 are two solid angle elements separated by an 
angle 9 on the sky. This function can be calculated from £(r) using 




10° 
r[cMpc] 



Figure 16. Mean 3D 2PCFs for the full simulation box with varying num- 
ber of observed sources. Despite the IGM being held constant, there is a de- 
crease in clustering when increasing the sample size, since more low-mass 
haloes from less biased regions are included. 



the Limber relation (Limbe rfl953 1, or directly using, for instance, 
the Landy-Szalay estimator ( [Lan dy & Szalay 1993), which is the 
approach we use here. The 3D 2PCF obviously contains more in- 
formation than the 2D version, but it requires knowledge of the 
line-of-sight positions of the galaxies, which at these redshifts can 
only be obtained from spectroscopy. 

Let us assume that a future hypothetical survey observed N 
LAEs in a region the same size as our simulation box. Would it be 
possible to say something about the ionized state of the IGM from 
the correlation function of the galaxies alone? In other words, is 
the correlation function in our model with the IGM included sig- 
nificantly different from the same model without the IGM taken 
into account? 

It might appear most intuitive at first to simply compare two 
samples (with and without IGM) with the same detection limit. This 
does indeed give a significantly higher correlation for the IGM case, 
but the difference in correlation comes only partly from the patch- 
iness of the IGM, but mostly from the fact that the non-IGM sam- 
ple will be much larger and include also some lower-mass haloes, 
which are less intrinsically clustered (illustrated in Fig.|16[). 

In this study, we do not focus on the detection limits of our 
hypothetical surveys. Instead, we take the same approach as |Iliev| 
|et al.| l [2008a| >, and simply assume that we observe a given number 
N LAEs in each field. We then compare the 2PCF of this sample 
to that of a sample of equal size, but without the effects of the IGM 
taken into account (such a sample could be obtained by observing 
in a wavelength range that is not affected by the IGM). This way, 
we isolate the extra clustering that is due to the topology of the 
IGM, and not related to the decreased number density of sources. 

3.4.1 Simulation results 

Fig. [IT] shows the 3D 2PCFs calculated from the 500 intrinsically 
brightest LAEs (averaged over all 50 sightlines per source) in our 
full simulation box compared to the 500 brightest with IGM effects 
included, assuming the GmG line model and the log-normal lumi- 
nosity relation, Eq. |l0]l. The solid line shows the mean correlation 
function over 50 sightlines and the error bars show the la spread 
among sightlines. 

In Fig.[T8]we estimate the size of the LAE sample that would 
be needed to tell a partly neutral IGM apart from a completely ion- 
ized one for some different global ionized fractions. This is done 
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Figure 17. Two-point correlation functions for the 500 intrinsically bright- 
est LAEs compared to the 500 brightest after including the IGM. The lines 
show the mean correlation function calculated from 50 random sightlines 
using the whole simulation box. The error bars indicate the lcr spread. 
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Figure 19. Same as Fig. [IT] but showing the 2D 2PCF for the 500 brightest 
haloes in the thin box. The error bars in this figure show the lcr lield-to-field 
variance. 
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Figure 18. Difference in correlation between the IGM and non-IGM case 
for different sample sizes and ionized fractions. The solid lines are for the 
GmG model, and the dotted lines are assuming the single-width Gaussian 
line model. 



by calculating the correlation functions for a number of samples of 
different sizes and for each sample fitting a power-law of the form: 



Figure 20. Same as Fig. |18| but for the two-dimensional correlation func- 
tion. 



(16) 



For every ionized fraction and sample size, we estimate the uncer- 
tainty a in the correlation length ro for the IGM case using the 
bootstrap method (Efron 1979). In Fig.[T8]we show the difference 
between ro with and without IGM in terms of a. As was hinted 
at already in Fig. [T7] the correlation method becomes much more 
effective at lower ionized fractions. For example, at (x) m = 0.2, al- 
ready a sample size of 150-200 LAEs would be enough for a 2a de- 
tection of IGM-boosted galaxy clustering. However, at {x) m = 0.7 
a sample size of up to 500-600 would be needed. 

Fig. [T8] appears a bit more pessimistic than for instance Mc- 



Quinn et al. ( 2007 ), who concluded that a sample size of 250 would 
be enough to distinguish {x) m = 0.7 from {x) m = 1. There are 
two reasons for this. The first is our assumed intrinsic line model. 
With the Gaussian intrinsic line, used by McQ uinn et al.|p007| >, 
the IGM transmission becomes more dependent on the size of the 
surrounding H II bubble, as we saw in Sec. |3. 1 . f| which is exactly 
the effect that causes the correlation to become stronger in a neutral 
IGM. With the more realistic, double-peaked, line the IGM absorp- 
tion is more similar across different haloes, especially at higher 
(x) m (see Fig. [5}, and the extra clustering is weaker. The second 
reason is the fact that we have included some random scatter in the 
mass-to-light relation, which further reduces the clustering signal. 

As we saw in Fig. [T2] to get a sample of several hundreds of 
LAEs at (i) m = 0.4 from the deep box, one would need to de- 
tect objects down to a detection limit of at least 10 42 erg/s. While 
obtaining spectra for hundreds of targets is unrealistic with today's 
ground-based observatories, it may well be possible for a telescope 
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such as the JWST. Using the NIRSpec instrument on the JWST, an 
exposure time of ~ 30 hours would be enough to obtain Lya spec- 
tra (and thus 3D positions) with reasonable S/N for sources down to 
10 42 erg/s (E. Zackrisson, private communication), although uncer- 
tainties in the knowledge of the intrinsic line shapes may introduce 
complications in the translation of the spectra to source redshifts. 

In Fig.[T9]we show the 2D correlation functions (see Eq. (15) ; 
note however that we are showing the projected distance rather than 
the angle on the sky), calculated with the Landy-Szalay estimator 
for our thin sub-box. Judging by this figure, it is not surprising that 
OulO do not see any effects of IGM-amplified clustering in their 
sample of 207 LAEs at z = 6.5. Even with a sample size of 500, 
we need about 70% neutral fraction to reliably distinguish the IGM 
case from the non-IGM case. 

This is quantified further in Fig. [20] which shows the same 
thing as Fig. [18] but for the 2D 2PCF. As expected, a bit of infor- 
mation is lost when throwing away one of the coordinates. Nev- 
ertheless, if the IGM is only 40 % ionized at z = 7, a sample of 
300 LAEs might very well be enough to detect the IGM cluster- 
ing at this redshift. Fig. 



12 



suggests that a detection limit of 10 
erg/s would be sufficient, which is only a factor 2.5 lower than the 
detection limit of OulO at z — 6.5. In fact, the right-most parts 
of Figs . [T8] and [20] may be somewhat conservative. Since we only 
consider haloes down to 1O 1O M0, the largest samples will have an 
artificial lower-mass cutoff which makes the IGM and non-IGM 
samples more similar than if even lower-mass haloes had been in- 
cluded. Thus, the difference in correlation functions could actually 
be slightly larger than shown here. 



4 SUMMARY AND DISCUSSION 
4.1 Summary 

We have presented results from simulations of LAEs during the 
epoch of reionization and compared these to observations in or- 
der to investigate to what extent we may learn something about 
the global ionized fraction of the IGM at different redshifts. To 
simulate the observed properties of LAEs, we devised a model 
that splits the radiative transfer of Lya photons into two regimes: 
one (circum-)galactic and one extra-galactic part. For the (circum- 
)galactic part, we approximate the emerging line shape as a double- 
peaked Gaussian minus a Gaussian (GmG), while for the extra- 
galactic part we carry out radiative transfer through our cosmolog- 
ical IGM box. We take the Lya luminosity to be proportional to 
halo mass, with a log-normal random scatter, and we fit the mass- 
to-light ratio to the Lya luminosity function at z ~ 6, measured by 
|Ouchi et al.|p0T0l >. 

We have shown that a double-peaked model such as our GmG 
produces some significant differences in results compared to as- 
suming a single Gaussian, as was done in e.g. Iliev et ah] (2008a 1, 
|McQuinn et al. 1(2007) and |Dijkstra et al.|j2007) . In general, our 
double-peaked model makes the transmitted fraction of Lya, T a , 
less sensitive to damping wing strength. This is because in this 
model, most of the emission is offset from the line centre as it en- 
ters the IGM, and (T a ) will be closer to 50 %, with less dependence 
on the size of the surrounding H II bubble. 

Comparing to observed LAE LFs at z = 6.5 by |Ouchi et al.| 
pOlO) and |Kashikawa et aL|p0TT) , and assuming that the evolu- 
tion in the LAE LF is due only to the ionization of the IGM, we 
have seen that our simulations best match the observations for an 
IGM ionized fraction of {x) m ~ 0.5 at z = 6.5. This is lower than 



the value of (x) > 0.8 ± 0.2, claimed by |Ouchi et al7|poTo) l (who 
modelled part of the change in LF as being due to intrinsic LAE 
evolution) but roughly consistent with the value of (x) ~ 0.62 
given by Kashikawa e t al.|p0"TT) . We note that both the LFs given 
in |Kashikawa et all 1 201 1) and that in |Ouchi et al.|(20i0) give us the 
same value for (x) m , despite the fact that these samples are from 
two fields that are well separated from each other, suggesting that 
the change is truly global and not due to sample variance. 

Furthermore, we compared our simulations to observed equiv- 
alent width distributions, and again found that a very neutral IGM 
is required if the changes between z ~ 6 and z ~ 7 are to be ex- 
plained by IGM evolution only. Finally, we have given predictions 
for the sample sizes needed to measure the increase in LAE cluster- 
ing due to a partly neutral IGM. The two-point correlation function 
does not require as strong assumptions about the intrinsic evolution 
of LAEs, and could thus serve as a valuable independent probe of 
the ionized state of the IGM. 



4.2 How reliable are constraints from LFs and REW 
distributions? 

Our comparisons to observed LAE luminosity functions and equiv- 
alent width distributions seem to be in rough agreement with other 
studies such as Kashikawa et al. ( 201 1 ) and Pente ricci et al.|(201 1) 
in requiring a very high neutral fraction at z = 6.5 and z = 7, but 
this hinges on the assumption that the evolution in these observ- 
ables is due to the IGM only and that intrinsic galaxy evolution is 
negligible. If we set aside the many caveats for a moment and take 
the values of {x) m — 0.5 at z = 6.5 and (x) m = 0.4 at z = 7 at 
face value, this would mean a somewhat later reionization than the 
scenario we used to produce our IGM boxes (Fig.[2]l. In the reion- 
ization scenario used in our simulations, the IGM is ionized only 
by galaxies, with small sources turned off as their environment be- 
comes sufficiently ionized. By adjusting the assumptions regarding 
star formation and escape fraction of ionizing photons, it is possible 
to obtain scenarios in which reionization finished later (Iliev et aT] 
[2012) . 

In the scenario used to produce our IGM boxes, the IGM goes 
from {x)m ~ 0.4 to (x) m ~ 1 between z = 7.7 and z = 6.5. This 
corresponds to a time interval of approximately 170 Myr, which is 
almost exactly the same as the time interval between z = 7 and 
z = 6 — the redshift range for the (x) m = 0.4 to {x) m ~ 1 transi- 
tion implied by our comparisons to observations. So while the par- 
ticular reionization scenario used here does not perfectly match the 
observations in terms of when reionization took place, it seems that 
the implied rapid change in ionized fraction is not unreasonable. It 
may also be objected that a late reionization will result in an elec- 
tron optical depth too low to match the observations by WMAP 
(Komatsu et al.|2 009 ). However, a fair comparison would require 
a knowledge of the earlier stages of reionization as well. If, for in- 
stance, reionization starts early with small sources there may be 
time to build up a sufficient electron density before galaxies start 
rapidly reionizing the IGM at lower redshifts (Ann et al.|2010) . 

A more serious problem is that these high neutral fractions 
are incompatible with other observational results. For instance, 
|Raskutti et al.| (2012) study the IGM temperature in quasar near- 
zones and find that reionization must have been completed by 
z > 6.5 at high confidence, and |Hu et al.|(2010) , Kashikaw a et al.| 
(2011) and |Ouchi et al.|(2010) study the Lya line shapes of LAEs 
at z = 6.5 and find no evidence of damping wings. 

One way to resolve this conflict would be to drop the no- 
intrinsic-evolution assumption, and allow part of the change in the 
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LAE luminosity functions between z — 6 and z = 6.5 to be ex- 
plained by intrinsic galaxy evolutions, as argued by, for instance, 
Dayal et al.| ( |2008) and |Ouchi et al.| ( |2010| >. As we noted in Sec. 



3.2.2 we can reproduce the drop in LF by galaxy mass-evolution 



only, although we then run into problems at lower redshifts. 

A second solution was recently suggested by Bolton & 
Haehnelt (2012) who showed that dense small-scale H I structures, 
below the resolution of our study, may play an important role in 
limiting the observability of LAEs even when the global neutral 
fraction is low. However, their results assume a constant UV back- 
ground and it is not yet clear how this translates to the more realis- 
tic case of a fluctuating ionizing rate which is higher in the denser 
regions close to sources. 

In conclusion, it would seem that any constraints on the global 
IGM ionized fraction obtained from LFs or REW distributions will 
remain uncertain at best until more is known about the intrinsic 
evolution of LAEs at z > 6 and the role and prevalence of small- 
scale H I structures. The degeneracy between intrinsic evolution 
and IGM ionization could possibly be broken with better observa- 
tions of the UV LF evolution of LAEs ( |Ouchi et al.|20T0] l, or by 2 1 
cm experiments such as LOFAR and SKA. 

In addition to the uncertainties related to intrinsic LAE evolu- 
tion, we have shown in this paper that simplifying assumptions re- 
garding Lya line shapes can have significant effects on the results 
of simulations. Line shapes with most of the emission to the sides of 
the line centre require more neutral hydrogen to obtain a given drop 
in transmitted fraction. As an example, if we redo our luminosity 
function calculations shown in Fig.[lO]using the Gaussian line pro- 
file, we get a best fit to the observations for {x} m ~ 0.7 as com- 
pared to 0.5 with the GmG model. While few spectroscopic obser- 
vations of sufficient resolution exist at high redshift, lower-redshift 
observations (e.g. |Tapken et al.|2 007 ; Verhamme et al. 2008} |Ya-| 
mada et al. 2012) show that most lines are either double-peaked or 
with a single peak that is offset from the line centre. For our pur- 
poses, these offset single-peaked lines are equivalent to the double- 
peaked ones — the important distinction is between the Gaussian 
lines with the emission centred at the line centre, and those where 
it is offset from the line centre. While we believe our GmG model 
to be more realistic that a single Gaussian, lower- ,z observations 
do show that most LAEs have some emission at the line centre (e.g. 
|Tapken et al.|2007) , even when a large part of the emission is shifted 
away from the centre. If this is true also at higher z, it would mean 
that the GmG model (which has zero emission at the line centre) 
tends to slightly under-predict the effect of a neutral IGM. 

4.3 Future prospects 

While both the luminosity functions and equivalent width distribu- 
tions suffer from degeneracies and model-dependencies, the two- 
point correlation function could be used to put some of the model 
assumptions to the test. After correctly plotting the results from 
|Iliev et ah]j2008a| >, it seems all theoretical models now agree that a 
significant neutral fraction should give an appreciable boost to the 
correlation function. While our line model and the fact that we al- 
low some random scatter in the mass-to-light ratio make our predic- 
tions for the two-point correlation function method as a means for 
constraining the IGM ionized fraction somewhat more pessimistic 
than |McQuinn et aT] < |2007| , it still seems that the method could 
be useful once observed LAE samples grow by a factor of two or 
three. Our analysis here has been focused only on the amplifica- 
tion of the clustering induced by the patchiness of the reionization 
process. Other effects such as intrinsic LAE bias (Ors i et al.|2008} 



and environment dependent selection effects ( |Zheng et al.||201l] l 
may cause LAEs to become more clustered than drop-out selected 
galaxies, but it is hard to imagine that these effects would change 
rapidly with redshift. 

As we show in Fig. [20] if only 2D positions are available, a 
sample size of many hundreds of LAEs is required for any mean- 
ingful constraints on (x) ra . It is thus not very surprising that Ouchi 
|et~aT[ p0T0| do not measure any IGM-induced boost in cluster- 
ing with their sample of 207 galaxies. Using our model, their non- 
detection gives only a weak lower limit of {x) m > 0.4. With future, 
larger, samples it may be possible to put more meaningful con- 
straints on {x)m using LAE clustering. With a sample ~ 700 LAEs 
we would expect a 2 a detection of IGM-amplified clustering if the 
IGM is > 50 % neutral, which could be used to test the predic- 
tions from luminosity functions and equivalent width distributions 
assuming no intrinsic galaxy evolution. With 1000 LAEs, even a 
neutral fraction of 30 % should give a strong clustering signal. With 
the new Hyper Suprime-Cam at the Subaru telescope, it is expected 
that future surveys will detect up to 10 000 LAEs at z = 6.5 over a 
30 deg 2 area on the sky (M. Ouchi, private communication). A de- 
tailed theoretical investigation of such a survey would require much 
larger-scale simulations than the ones presented here, but simple 
extrapolation of Fig. [20] would suggest that 10 000 detected LAEs 
can put strong constraints on the global ionized fraction. 
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